The Effects Of Human Variation In DUC Summarization Evaluation
Abstract
There is a long history of research in automatic text summarization systems by both the text retrieval and the natural language processing communities, but evaluation of such systems' output has always presented problems. One critical problem remains: how to handle the unavoidable variability in the human judgments at the core of all such evaluations. Sponsored by the DARPA TIDES project, NIST launched a new text summarization evaluation effort, called DUC, in 2001, with follow-on workshops in 2002 and 2003. Human judgments provided the foundation for all three evaluations, and this paper examines how the variation in those judgments does and does not affect the results and their interpretation.
Related papers
Focused multi-document summarization: Human summarization activity vs. automated systems techniques
Focused Multi-Document Summarization (MDS) is concerned with summarizing documents in a collection with a concentration toward a particular external request (e.g., a query, question, or topic), or focus. Although the current state of the art provides reasonable performance on DUC/TAC-like evaluations (i.e., government and news concerns), other considerations need to be explored. This paper...
The LIA summarization system at DUC-2007
This paper presents the LIA summarization systems participating in DUC 2007. This is the LIA's second participation in DUC, and we discuss our systems for both the main and update tasks. The system proposed for the main task is the combination of seven different sentence selection systems. The fusion of the system outputs is performed with a weighted graph whose cost functions integrate the ...
Re-using High-quality Resources for Continued Evaluation of Automated Summarization Systems
In this paper we present a method for re-using the human judgements on summary quality provided by the DUC contest. The score to be awarded to automatic summaries is calculated as a function of the scores assigned manually to the most similar summaries for the same document. This approach enhances the standard n-gram based evaluation of automatic summarization systems by establishing similariti...
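The mechanics of that idea are easy to sketch: score a candidate summary as a similarity-weighted average of the manual scores of previously judged summaries for the same document. The snippet below is a minimal illustration of this scheme, not the paper's implementation; the unigram-overlap similarity measure, function names, and example data are all assumptions.

```python
from collections import Counter

def ngram_counts(text, n=1):
    """Bag of word n-grams for a summary."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def similarity(a, b, n=1):
    """Overlap coefficient between two summaries' n-gram bags."""
    ca, cb = ngram_counts(a, n), ngram_counts(b, n)
    overlap = sum((ca & cb).values())  # Counter & Counter keeps minimum counts
    return overlap / max(1, min(sum(ca.values()), sum(cb.values())))

def estimate_score(candidate, judged, n=1):
    """Similarity-weighted average of the manual scores of judged summaries."""
    weighted = [(similarity(candidate, s, n), score) for s, score in judged]
    total = sum(w for w, _ in weighted)
    if total == 0:
        return 0.0  # no overlap with any judged summary
    return sum(w * score for w, score in weighted) / total

# judged: (summary text, manual quality score) pairs for the same document
judged = [("the senate passed the bill on friday", 4.0),
          ("lawmakers approved new legislation", 3.0)]
print(estimate_score("the senate approved the bill", judged))
```

In effect, the expensive human judgments are re-used as anchors, and only the similarity computation runs on new system output.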
The Hong Kong Polytechnic University at DUC2005
This paper discusses the query-based multi-document summarization techniques implemented by the Hong Kong Polytechnic University at DUC 2005. The summarization system is built within the framework of MEAD. In addition to borrowing the features provided by MEAD for text summarization, including centroid and sentence length, we also introduce entity-based, pattern-based, term-based and semanti...
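A MEAD-style extractive scorer combines per-sentence features linearly. The sketch below shows only the two features named above, centroid similarity and sentence length; the feature weights, tokenization, and example cluster are illustrative assumptions, not MEAD's actual configuration.

```python
from collections import Counter

def tokens(text):
    return text.lower().split()

def centroid(sentences):
    """Pseudo-document centroid: aggregate term frequencies over the cluster."""
    c = Counter()
    for s in sentences:
        c.update(tokens(s))
    return c

def score_sentences(sentences, w_centroid=1.0, w_length=0.5):
    """Rank sentences by a linear mix of centroid and length features."""
    cent = centroid(sentences)
    max_len = max(len(tokens(s)) for s in sentences)
    scored = []
    for s in sentences:
        c_score = sum(cent[t] for t in set(tokens(s)))  # centroid feature
        l_score = len(tokens(s)) / max_len               # normalized length
        scored.append((w_centroid * c_score + w_length * l_score, s))
    return sorted(scored, reverse=True)

cluster = ["The flood displaced thousands of residents.",
           "Rescue teams arrived late on Tuesday.",
           "Thousands of displaced residents awaited rescue teams."]
for score, sent in score_sentences(cluster)[:2]:  # top-2 sentences as the summary
    print(f"{score:.2f}  {sent}")
```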
Automatic Evaluation Of Summaries Using Document Graphs
Summarization evaluation has always been a challenge for researchers in the document summarization field. Usually, human involvement is necessary to evaluate the quality of a summary. Here we present a new method for automatic evaluation of text summaries using document graphs. Data from the Document Understanding Conference 2002 (DUC-2002) has been used in the experiment. We propose measuring th...
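The underlying idea can be illustrated simply: represent both the full document and the summary as graphs whose nodes are terms and whose edges link related terms, then score the summary by how much of the document graph it preserves. The edge definition below (adjacent-word pairs) and the overlap measure are assumptions for illustration only, not the paper's exact graph construction.

```python
def doc_graph(text):
    """Toy document graph: nodes are words, edges link adjacent word pairs."""
    words = text.lower().split()
    return set(frozenset(p) for p in zip(words, words[1:]) if len(set(p)) == 2)

def graph_similarity(document, summary):
    """Fraction of the summary's relations also present in the document graph."""
    dg, sg = doc_graph(document), doc_graph(summary)
    return len(dg & sg) / max(1, len(sg))

document = "the storm hit the coast and the storm caused severe flooding"
summary = "the storm caused flooding"
print(f"graph similarity: {graph_similarity(document, summary):.2f}")
```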